Sonancia Crowdsourcing DataSet [Modified*]

This dataset includes a modified* copy of the crowdsourcing annotations and of the features extracted with the openSMILE tool, both of which were used to train audio affect models for the Sonancia system. If you use this dataset in your work, please cite the following paper: Lopes, Liapis and Yannakakis, "Modelling Affect for Horror Soundscapes", IEEE Transactions on Affective Computing, 2017.

[*Modification details]: The Preference Annotation files were modified so that sounds are represented by their numeric IDs (as in the Feature files) rather than by their alphanumeric file names. Additionally, the Feature files were modified so that the 'name' column was removed (since alphanumeric values are not yet supported by PLT).

Preference Annotations

Preference annotations are organized in .csv files, where the name of each file indicates which affect state was annotated and which dataset type it belongs to. Affect states are: Arousal, Tension, Valence. The dataset type specifies which sounds are included in that specific dataset. More precisely, the General dataset contains annotations for sounds both with and without effects, JustEffects contains annotations of sounds with effects applied exclusively, and NoEffect contains annotations of sounds without any effects applied. Each line is a preference annotation of the form: [Preferred_Sound_Id],[Unpreferred_Sound_Id]
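
As an illustration, the following Python sketch reads one of these annotation files into (preferred, unpreferred) ID pairs. It is only a sketch: whether the files carry a header row is not guaranteed here, so non-numeric lines are simply skipped, and the file name in the usage comment is hypothetical.

import csv

def load_preference_pairs(path):
    """Read a preference annotation .csv into a list of
    (preferred_sound_id, unpreferred_sound_id) tuples."""
    pairs = []
    with open(path, newline="") as f:
        for row in csv.reader(f):
            # Skip headers, blank lines, or anything that is not a pair of numeric IDs.
            if len(row) < 2 or not row[0].strip().isdigit():
                continue
            pairs.append((int(row[0]), int(row[1])))
    return pairs

# Example usage (hypothetical file name following the
# [AffectState]_[DatasetType] naming described above):
# pairs = load_preference_pairs("Tension_General.csv")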


The Feature Datasets

The Low-Level Descriptor (LLD) files contain features extracted with the openSMILE audio feature extraction tool, using the INTERSPEECH 2009 Emotion Challenge feature set. For more information on the specific features, please refer to the openSMILE documentation (http://www.audeering.com/research-and-open-source/files/openSMILE-book-latest.pdf).
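
For reference, features of this kind are typically produced with the SMILExtract command-line tool that ships with openSMILE. The sketch below shows how such an extraction could be invoked from Python for a single sound file; it is an assumption-laden illustration, since the exact name and location of the INTERSPEECH 2009 configuration file differ between openSMILE releases, the output format is dictated by that configuration, and the original sound files are not distributed with this dataset (see Final Comments).

import subprocess

# NOTE: the configuration path below is an assumption; check your local
# openSMILE installation for the INTERSPEECH 2009 Emotion Challenge config.
IS09_CONFIG = "config/is09-13/IS09_emotion.conf"

def extract_is09_features(wav_path, out_path):
    """Run SMILExtract on a single .wav file, writing features to out_path."""
    subprocess.run(
        ["SMILExtract", "-C", IS09_CONFIG, "-I", wav_path, "-O", out_path],
        check=True,
    )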

Each LLD file name follows the structure [IncludedSounds]_[FeaturesIncluded]_[PruningType]. The first parameter indicates which sounds are included in the dataset, in the same fashion as the preference annotation files. The second parameter indicates which features are included in that particular dataset: All contains every feature extracted through openSMILE, while MFCC contains only the extracted MFCC features. The last parameter defines the pruning type: pruned datasets have several sounds removed because aggressive effects produced defective audio, while unpruned datasets contain every sound without any pruning.
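
To make the intended use of these files concrete, the sketch below pairs a feature file with the preference annotations for preference learning. It relies on two assumptions that should be checked against the actual files: that the feature files are plain comma-separated text, and that their first column holds the numeric sound ID with the openSMILE features in the remaining columns. Both file names in the usage comment are illustrative, and load_preference_pairs is the helper sketched in the Preference Annotations section.

import csv

def load_features(path):
    """Read an LLD file into a dict mapping sound ID -> feature vector.
    Assumes column 0 is the numeric sound ID; adjust if the layout differs."""
    features = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            if not row or not row[0].strip().isdigit():
                continue  # skip header or malformed lines
            features[int(row[0])] = [float(v) for v in row[1:]]
    return features

# Illustrative pairing of annotations and features (file names are hypothetical):
# feats = load_features("General_All_unpruned.csv")
# pairs = load_preference_pairs("Tension_General.csv")
# training_pairs = [(feats[p], feats[u]) for p, u in pairs
#                   if p in feats and u in feats]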


Final Comments

Due to hosting constraints, the original sound files are not included. If you require these files in order to run your own feature extraction software, please contact Phil Lopes at louisphil.lopes@gmail.com.
